Polypeptides capable of interacting with the human topoisomerase IIIα

ABSTRACT

The invention concerns novel polypeptides capable of interacting with the human topoisomerase IIIα and the nucleic acid sequences coding for said polypeptides. The invention also concerns a method for identifying compounds capable of interacting with said polypeptides and a method for identifying molecules capable of modulating the interaction of the topoisomerase IIIα with said polypeptides.

The present invention relates to novel polypeptides capable of interacting with human topoisomerase IIIα and to the nucleic acid sequences encoding these polypeptides. It also relates to a method for identifying compounds capable of interacting with said polypeptides and to a method for identifying molecules capable of modulating the interaction of topoisomerase IIIα with said polypeptides.

The replication of DNA is a complex mechanism which involves a large number of factors. DNA exists in the physiological state in a supercoiled form and access to the information which it contains requires substantial modification of the degree of coiling. Replication requires the suppression of the supercoils, the separation of the two strands of the DNA double helix and the maintaining of DNA in single-stranded form.

The modification of the degree of coiling is brought about in vivo by topoisomerases, which are enzymes capable of modifying the DNA superstructures. It is possible to distinguish type I topoisomerases, which cut only one of the two DNA strands and which eliminate the supercoils, and type II topoisomerases, which act by cutting the two DNA strands and which are capable of eliminating or creating the supercoils. Eukaryotic topoisomerases are less well known than their prokaryotic homologs and their mechanism of action has not yet been elucidated.

The separation of the two strands of a DNA duplex is catalyzed by a group of enzymes, called DNA helicases, which act in an ATP-dependent manner so as to produce the single-stranded DNA used as template for the DNA replication and transcription processes. Generally, the helicases bind to the single-stranded DNA or to the junctions between the single- and double-stranded DNA, and move in a single direction along the DNA in the double-stranded region, destroying the hydrogen bonds joining the two strands. All helicases exhibit a DNA-dependent ATPase (or NTPase) activity which hydrolyzes the gamma phosphate of the ribonucleoside or deoxyribonucleoside 5′-triphosphate and provides the energy necessary for the reaction. The first helicase was discovered in E. coli in 1976. Since then, more than 60 helicases have been isolated in prokaryotes and eukaryotes. The role of human helicases has still not been elucidated in most cases, with the exception of HDHII (repair of the lesions induced by X-rays), HDHIV (assembly of preribosomes), ERCC2 and ERCC3, which are involved in repair through excision and cell viability. Little is known about the structure of these helicases. A large portion of the information available on the structures and functions of helicases has been obtained by comparative analysis of the amino acid sequences. In particular, conserved motifs have made it possible to group helicases into subfamilies based on the sequence homologies.

Human Topoisomerase III belongs to the family of type IA topoisomerases and therefore exhibits sequence homologies with E. coli topoisomerases I and III, yeast Topoisomerase III as well as reverse gyrase from archaebacteria. Human Topoisomerase III is now called Topoisomerase IIIα so as to differentiate it from human topoisomerase IIIβ, which was recently discovered during the sequencing of the human immunoglobulin λ gene locus (Kawasaki, K., Minoshima, S., Nakato, E., Shibuya, K., Shintani, A., Schmeits, J. L., Wang, J. and Shimizu, N. 1997, Genome Research 7: 250-261), and for which no functional activity has been shown. Yeast-expressed and unpurified topoisomerase IIIα exhibits an activity of partial relaxation of a highly negatively supercoiled DNA (Hanai, R., Caron, P. R. and Wang, J. C. 1996. Proc. Natl. Acad. Sci. USA 93: 3653-3657).

Topoisomerase IIIα is a protein of 976 amino acids with a molecular weight of about 110 kDa. The gene encoding human Topoisomerase IIIα is present in a single copy on chromosome 17p11.2-12 (Hanai, R., Caron, P. R. and Wang, J. C. 1996. Proc. Natl. Acad. Sci. USA 93: 3653-3657). A murine homolog of Topoisomerase III was recently cloned (Seki, T., Deki, M., Katada, T. and Enomoto, T. 1998. Biochim Biophys Acta 1396: 127-131).

Topoisomerase IIIα exhibits a strong sequence homology with yeast Topoisomerase III, namely 44% sequence identity and 61% similarity. The homology which it exhibits with bacterial topoisomerases I and III is less strong, namely 24% identity and 44% similarity. However, Topoisomerase IIIα resembles E. coli Topoisomerase I more than it resembles the other members of the group of type IA topoisomerases from the point of view of the organization of the protein into domains. Indeed, these two polypeptides contain a C-terminal domain which has no equivalent in E. coli or yeast Topoisomerase III. This C-terminal domain contains motifs with 4 cysteines (3 motifs for E. coli Topoisomerase I and 1.5 motifs for human Topoisomerase IIIα), as well as an extreme C-terminal domain for which a DNA-binding role has been demonstrated for E. coli Topoisomerase I.
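As an illustration of how identity and similarity percentages of this kind are obtained from a pairwise alignment, the following is a minimal sketch. It assumes the two sequences are already aligned (gaps written as "-"), counts only ungapped columns, and uses a hypothetical grouping of "similar" residues; the function name, the groups and the toy sequences are assumptions made for illustration and are not taken from the sequences discussed here.

```python
# Hypothetical groups of residues treated as "similar"; real alignment tools
# use a substitution matrix, so these groups are an illustrative assumption.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"),
                  set("ST"), set("NQ"), set("AG")]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) over ungapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # columns containing a gap are not counted
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy aligned fragments (hypothetical, not taken from any SEQ ID).
identity, similarity = identity_and_similarity("KDEL", "KEEL")
print(identity, similarity)
```

The denominator chosen here (ungapped columns) is only one of several conventions; the percentages quoted in the text may have been computed with a different one.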

The role of human topoisomerase IIIα in the cell has not yet been identified.

Human Topoisomerase IIIα appears to be essential, at least during embryogenesis, since the knock-out of the murine homolog of Topoisomerase IIIα is lethal (Li, W. and Wang, J. C. 1998 Proc. Natl. Acad. Sci. USA 95: 1010-1013). The messenger RNAs for Topoisomerase IIIα are present in numerous tissues (heart, brain, placenta, lung, liver, skeletal muscle, kidney, pancreas) in the form of three transcripts of 7.2, 6 and 4 kilobases in size (Fritz, E., Elsea, S. H., Patel, P. I. and Meyn, M. S. 1997 Proc. Natl. Acad. Sci. USA 94: 4538-4542).

Moreover, it has been assumed that Topoisomerase IIIα plays a role in maintaining the stability of the genome. Indeed, the cDNA CAT4.5, encoding a truncated human Topoisomerase IIIα of 141 N-terminal amino acids, is capable of complementing the phenotype for hypersensitivity to ionizing radiation in AT (Ataxia-Telangiectasia) cells exhibiting a mutation in the ATM gene (Fritz, E., Elsea, S. H., Patel, P. I. and Meyn, M. S. 1997 Proc. Natl. Acad. Sci. USA 94: 4538-4542).

In yeast, two independent studies have shown the existence of an interaction between the helicase SGS1 and yeast Topoisomerase III. On the one hand, the sgs1- mutants are suppressors of the top3- phenotype (slow growth, hyperrecombination) in the yeast S. cerevisiae (Gangloff, S., McDonald, J. P., Bendixen, C., Arthur, L. and Rothstein, R. 1994. Mol. Cell. Biol. 14: 8391-8398). On the other hand, it has been shown that the first 500 amino acids of SGS1 interact with yeast Topoisomerase III (Gangloff, S., McDonald, J. P., Bendixen, C., Arthur, L. and Rothstein, R. 1994. Mol. Cell. Biol. 14: 8391-8398; Lu, J., Mullen, J. R., Brill, S. J., Kleff, S., Romeo, A. M. and Sternglanz, R. 1996. Nature 383: 678-679). However, to date, no interaction between a helicase and human Topoisomerase IIIα has been identified.

The identification of partners of human topoisomerase IIIα therefore constitutes a major challenge for the understanding of the role of human topoisomerase IIIα, and of its mechanism of action.

The present invention results from the demonstration of novel polypeptides capable of interacting with topoisomerase IIIα (called hereinafter polypeptide partners of topoisomerase IIIα). It also results from the discovery that these polypeptides show a strong homology with proteins which exhibit structural characteristics common to RNA helicases and for which no function had so far been described. The demonstration of this interaction and of these homologies designates these proteins as DNA helicase partners of topoisomerase IIIα. The identification of these partners makes it possible to envisage numerous applications based on the combined action of these partner proteins and of topoisomerase IIIα; these applications relate in particular to:

1) The destruction of the nucleosomal structure: to undergo some processes such as replication, transcription, repair or recombination, DNA should be accessible to the corresponding enzymatic machineries and, to do this, the nucleosomal structure should be transiently destroyed. It is thus possible to envisage that helicase locally separates the DNA strands and creates positive supercoils ahead of it and negative supercoils behind it. The positive twist is absorbed by the disruption of the nucleosomes, while the negative twist is selectively relaxed by type IA topoisomerase.

2) The positive supercoiling of DNA: the interaction between helicase and type IA topoisomerase is likely to reconstitute in a eukaryotic organism the reverse gyrase activity of thermophilic archaebacteria. Indeed, it has been shown that Sulfolobus acidocaldarius reverse gyrase possesses at the N terminus a helicase domain containing the 8 motifs of helicases with a “DEAD” motif, and at the C terminus a topoisomerase domain homologous to the type IA topoisomerases (Confalonieri, F., Edie, C., Nadal, M., Bouthier de la Tour, C., Forterre, P. and Duguet, M. 1993. Proc. Natl. Acad. Sci. USA 90: 4753-4757); this enzyme relaxes the negatively supercoiled DNA and introduces positive supercoils into the circular DNA in an ATP-dependent manner (Forterre, P., Mirambeau, G., Jaxel, C., Nadal, M. and Duguet, M. 1985. EMBO J. 4: 2123-2128). This eukaryotic reverse gyrase activity can serve to eliminate particular DNA structures such as the cruciform DNA, the Z DNA, mismatches, recombination intermediates, and the like. From these observations and from the demonstration that topoisomerase IIIα is capable of interacting with a protein possessing the properties of a DNA helicase, it is possible to envisage the production in vivo or in vitro of a topoisomerase IIIα/protein partner complex constituting an enzymatic complex having reverse gyrase type functions. It should be noted that such a function of positive supercoiling of DNA has still never been described in eukaryotes.

3) The segregation of newly replicated chromosomes: at the end of the replication of DNA, topological problems appear at the level of the point of convergence of two replication forks. A mechanism which makes it possible to solve this topological problem involves the concerted action of a helicase and a type IA topoisomerase, capable of decatenating two single-stranded DNA molecules. This model (Wang, J. C. 1991. J. Biol. Chem. 266: 6659-6662; Rothstein, R. and Gangloff, S. 1995. Genome Research 5: 421-426) proposes that at the point where two replication forks meet, replication is stopped, leaving portions of entangled single-stranded DNAs. These are then separated by means of the concerted action of helicase and topoisomerase. The synthesis of DNA is then completed at the level of the single-stranded regions.

4) The recombination and the stability of the genome: it has been shown that Top3- and Sgs1- yeast mutants both exhibit a hyperrecombination phenotype while Top3-/Sgs1- double mutants recover a normal phenotype. This shows that yeast Topoisomerase III and helicase SGS1 probably act in a concerted manner to maintain a low rate of recombination, for example by a positive supercoiling activity of the reverse gyrase type, or by a more direct mechanism at the level of the pairings of the recombination intermediates.

Unlike the helicase SGS1, known to interact with yeast topoisomerase III, the protein partner of topoisomerase IIIα identified by the applicant does not belong to the family of RecQ type helicases.

The polypeptides according to the invention show a high degree of homology with the sequence of a human protein DDX14 published by Chung et al. (Chung, J., Lee, S-G., and Song, K. 1995. Korean J. Biochem. 27: 193-197). The protein DDX14 exhibits a significant sequence homology with an RNA helicase of murine origin; however, the helicase activity of this protein has not yet been demonstrated and the function of DDX14 has not yet been elucidated.

The polypeptides according to the invention also show a high degree of homology with the sequence of a human protein DBX1 published by Lahn et al. (Lahn, T. and Page, D. C. 1997. Science. 278: 675-680). The DBX1 gene encodes a protein which exhibits homologies with RNA helicases, but its helicase activity has never been demonstrated and the function of the DBX1 protein has not yet been identified.

The DBX1 gene encodes a protein of 662 amino acids. The gene is situated on the X sex chromosome and its homolog situated on the Y chromosome is 91% identical at the protein level. The nucleic and polypeptide sequences of DBX1 are presented in the sequences SEQ ID No. 5 and SEQ ID No. 6. The expression of the DBX1 gene appears to be ubiquitous. It has now been demonstrated that the DBX1 protein possesses the 8 motifs characteristic of helicases of the “DEAD” family. More precisely, it belongs to the subfamily represented by the helicase PL10, and whose recorded members are the helicases DED1 and DBP1 from yeast, the helicase An3 from amphibians and the murine helicases PL10, mDEAD2 and mDEAD3 (Gee, S. L. and Conboy, J. G. 1994. Gene 140: 171-177). Helicases belonging to this subfamily contain, in addition to the central catalytic domain containing the 8 conserved motifs of helicases, particular N- and C-terminal domains. The C-terminal domain is rich in arginines and serines, which resembles the domains of splicing factors. However, in the case of the helicases of this subfamily, this domain rich in arginines and serines is shorter and does not possess as many RS dipeptides as in the prototype domain of splicing factors.

The invention also provides a method for identifying molecules capable of blocking the interaction between human Topoisomerase IIIα and a polypeptide partner of topoisomerase IIIα. Such a method makes it possible to identify molecules which are in particular capable of blocking the reverse gyrase type activity of these two proteins. Such molecules are useful for modulating the processes of division, replication, transcription, translation, splicing, repair or recombination of DNA. These molecules are also capable of possessing a cytotoxic type antitumor activity because of the disruption of these basic processes at the level of the DNA.

A first subject of the invention therefore relates to nucleotide sequences encoding polypeptides capable of interacting with topoisomerase IIIα.

Preferably, the nucleotide sequences according to the invention encode a polypeptide comprising all or part of the polypeptide sequence described in the sequence SEQ ID No. 4 or its derivatives.

For the purposes of the present invention, the term derived polypeptide sequence denotes any polypeptide sequence differing from the sequence considered, obtained by one or more modifications of a genetic and/or chemical nature, and possessing the capacity to interact with topoisomerase IIIα. Modification of a genetic and/or chemical nature is understood to mean any mutation, substitution, deletion, addition and/or modification of one or more residues. Such derivatives may be generated with different aims, such as in particular that of improving its levels of production, that of increasing its resistance to proteases or of improving its passage across the cell membranes, that of increasing its therapeutic efficacy or of reducing its side effects, that of increasing the affinity of the peptide for its site of interaction, or that of conferring novel pharmacokinetic and/or biological properties on it. Advantageously, the variants comprise deletions or mutations affecting amino acids whose presence is not decisive for the activity of the derivative. Such amino acids may be identified for example by tests of cellular activity as described in the examples.

Preferably still, the nucleotide sequences according to the present invention comprise all or part of the nucleotide sequence described in the sequence SEQ ID No. 3 and encoding the sequence SEQ ID No. 4 or the sequences derived from this nucleotide sequence.

For the purposes of the present invention, the term derived nucleotide sequence denotes any sequence differing from the sequence considered because of the degeneracy of the genetic code, obtained by one or more modifications of a genetic and/or chemical nature, as well as any sequence hybridizing with these sequences or fragments thereof and encoding a polypeptide capable of interacting with Topoisomerase IIIα. The expression modification of a genetic and/or chemical nature is understood to mean any mutation, substitution, deletion, addition and/or modification of one or more residues. The term derivative also comprises the sequences homologous to the sequence considered, which are derived from other cellular sources and in particular from cells of human origin, or from other organisms. Such homologous sequences may be obtained by hybridization experiments. The hybridizations may be carried out starting with nucleic acid libraries, using the native sequence or a fragment thereof as probe, under variable hybridization conditions.
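The degeneracy of the genetic code mentioned above can be illustrated with a short sketch: two different nucleotide sequences that translate into the same peptide. The standard genetic code table is built in the usual compact T/C/A/G ordering; the two DNA fragments and the function name are hypothetical examples chosen for illustration and are not taken from SEQ ID No. 3 or SEQ ID No. 5.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA coding sequence read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical sequences differing at every third codon position.
variant_1 = "GATGAAGCTGAT"
variant_2 = "GACGAGGCCGAC"
print(translate(variant_1), translate(variant_2))
```

Both fragments differ at four nucleotide positions yet encode the same four residues, which is the sense in which a derived nucleotide sequence may differ from the reference while encoding the same polypeptide.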

The nucleotide sequences according to the invention may be of artificial origin or otherwise. They may be genomic sequences, cDNA, RNA, hybrid sequences or synthetic or semisynthetic sequences. These sequences may be obtained for example by screening DNA libraries (cDNA library, genomic DNA library) by means of probes produced on the basis of sequences presented above. Such libraries may be prepared from cells of different origins by conventional molecular biology techniques known to persons skilled in the art. The nucleotide sequences of the invention may also be prepared by chemical synthesis or by mixed methods including chemical or enzymatic modification of sequences obtained by the screening of libraries. In general, the nucleic acids of the invention may be prepared according to any technique known to persons skilled in the art.

The subject of the present invention is also polypeptides capable of interacting with topoisomerase IIIα.

For the purposes of the present invention, the name topoisomerase IIIα covers human topoisomerase IIIα in itself as well as the homologous forms corresponding in particular to mutated forms of this protein.

Preferably, the polypeptides according to the invention comprise all or part of the polypeptide sequence described in SEQ ID No. 4 or of its derivatives.

The present invention also includes a polypeptide characterized in that it is a fragment of the DBX1 protein, capable of interacting with topoisomerase IIIα and comprising all or part of the polypeptide fragment which extends between residues 318-662 and is represented in the polypeptide sequence SEQ ID No. 6 or its derivatives.

The subject of the present invention is also the use of the polypeptides according to the invention or of fragments of these polypeptides, for slowing down, inhibiting, stimulating or modulating the activity of topoisomerase IIIα.

Indeed, it is possible to envisage regulating the function of topoisomerase IIIα by means of the polypeptides according to the invention or of fragments thereof and in particular inhibiting or slowing down the activity of topoisomerase IIIα. This modification of the activity of topoisomerase IIIα is capable of leading to a slowing down of cellular growth or a blocking of the cell cycle or of inducing apoptosis.

Another subject of the present invention relates to a method for preparing the polypeptides according to the invention according to which a cell containing a nucleotide sequence encoding said polypeptides is cultured under conditions for expressing said sequence and the polypeptide produced is recovered. In this case, the part encoding said polypeptide is generally placed under the control of signals allowing its expression in a cellular host. The choice of these signals (promoters, terminators, leader sequence for secretion, and the like) may vary according to the cellular host used. Moreover, the nucleotide sequences of the invention may form part of a vector which may be autonomously replicating or integrative. More particularly, autonomously replicating vectors may be prepared using autonomously replicating sequences in the chosen host. As regards integrative vectors, these may be prepared, for example, using sequences homologous to certain regions of the genome of the host, allowing, through homologous recombination, the integration of the vector.

The subject of the present invention is also host cells transformed with a nucleic acid comprising a nucleotide sequence according to the invention. The cellular hosts which can be used for the production of the polypeptides of the invention by the recombinant route are both eukaryotic and prokaryotic hosts. Among the suitable eukaryotic hosts, animal cells, yeasts or fungi may be mentioned. In particular, as regards yeasts, yeasts of the genus Saccharomyces, Kluyveromyces, Pichia, Schwanniomyces or Hansenula may be mentioned. As regards animal cells, the insect cells Sf9, the cells COS, CHO, C127, of human neuroblastomas, and the like, may be mentioned. Among the fungi, Aspergillus spp. or Trichoderma spp. may be more particularly mentioned. As prokaryotic hosts, the use of the following bacteria is preferred: E. coli, Bacillus or Streptomyces.

According to a preferred mode, the host cells are advantageously represented by recombinant yeast strains.

Preferably, the host cells comprise at least one sequence or one fragment of a sequence chosen from the nucleotide sequences SEQ ID No. 3 or SEQ ID No. 5, for the production of the polypeptides according to the invention.

The nucleotide sequences according to the invention may be incorporated into viral or nonviral vectors, allowing their administration in vitro, in vivo or ex vivo.

Another subject of the invention relates, in addition, to any vector comprising a nucleotide sequence encoding a polypeptide according to the invention. The vector of the invention may be for example a plasmid, a cosmid or any DNA not encapsulated by a virus, a phage, an artificial chromosome, a recombinant virus, and the like. It is preferably a plasmid or a recombinant virus.

As viral vectors in accordance with the invention, there may be most particularly mentioned vectors of the adenovirus, retrovirus, adeno-associated virus, herpesvirus or vaccinia virus type. The subject of the present application is also defective recombinant viruses comprising a heterologous nucleic sequence encoding a polypeptide according to the invention.

Another subject of the invention consists in polyclonal or monoclonal antibodies or antibody fragments directed against a polypeptide as defined above. Such antibodies may be generated by methods known to persons skilled in the art. In particular, these antibodies may be prepared by immunizing an animal against a polypeptide whose sequence is chosen from the sequences SEQ ID No. 4 or SEQ ID No. 6 or any fragment or derivative thereof, and then collecting blood and isolating antibodies. These antibodies may also be generated by preparing hybridomas according to techniques known to persons skilled in the art. The antibodies or antibody fragments according to the invention may in particular be used to inhibit and/or reveal the interaction between topoisomerase IIIα and the polypeptides as defined above.

Another subject of the present invention relates to a method for identifying compounds capable of binding to the polypeptides according to the invention. The identification and/or isolation of these compounds or ligands may be carried out according to the following steps:

a molecule or a mixture containing various molecules, optionally unidentified, is brought into contact with a polypeptide of the invention under conditions allowing the interaction between said polypeptide and said molecule in the case where the latter might possess affinity for said polypeptide, and,

the molecules bound to said polypeptide of the invention are detected and/or isolated.

According to a particular mode, such a method makes it possible to identify molecules capable of blocking the helicase type activity, in particular the DNA helicase activity of the DBX1 protein or of the polypeptides according to the invention and thus modulate the processes of division, replication or transcription of DNA. These molecules are capable of possessing a cytotoxic type antitumor activity because of the disruption of these basic processes at the level of the DNA.

In this regard, another subject of the invention relates to compounds or ligands capable of binding to the polypeptides according to the invention and capable of being obtained according to the method defined above.

Another subject of the invention relates to the use of a compound or of a ligand identified and/or obtained according to the method described above as a medicament. Such compounds are indeed capable of being used for the prevention, improvement or treatment of certain conditions involving a cell cycle dysfunction.

The subject of the invention is also any pharmaceutical composition comprising, as active ingredient, at least one ligand obtained according to the method described above.

Another subject of the present invention relates to a method of identifying compounds capable of modulating or of completely or partially inhibiting the interaction between topoisomerase IIIα and the polypeptides according to the invention or the DBX1 protein.

The identification and/or isolation of modulators or ligands capable of modulating or of completely or partially inhibiting the interaction between topoisomerase IIIα and the polypeptides according to the invention or the DBX1 protein may be carried out according to the following steps:

the binding of topoisomerase IIIα or of a fragment thereof to a polypeptide according to the invention is carried out;

a compound to be tested for its capacity to inhibit the binding between topoisomerase IIIα and the polypeptides according to the invention is added;

it is determined whether topoisomerase IIIα or the polypeptides according to the invention are displaced from the binding or prevented from binding;

the compounds which prevent or which impede the binding between topoisomerase IIIα and the polypeptides according to the invention are detected and/or isolated.
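The decision made in the last two steps above can be sketched numerically. The following minimal example assumes a hypothetical assay in which the amount of bound complex is read as a signal, with a control measured without compound and a background measured without binding partner; the function names, the 50% threshold and the readings are all illustrative assumptions and do not describe an assay disclosed here.

```python
def percent_inhibition(signal_with_compound, signal_no_compound, background):
    """Percent reduction of the specific binding signal caused by a compound."""
    specific_control = signal_no_compound - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_test = signal_with_compound - background
    return 100.0 * (1.0 - specific_test / specific_control)


def select_hits(readings, signal_no_compound, background, threshold=50.0):
    """Return names of compounds whose inhibition reaches the threshold."""
    return [name for name, signal in readings.items()
            if percent_inhibition(signal, signal_no_compound, background)
            >= threshold]


# Hypothetical readings (arbitrary units) for three test compounds.
readings = {"compound_A": 55.0, "compound_B": 95.0, "compound_C": 19.0}
hits = select_hits(readings, signal_no_compound=100.0, background=10.0)
print(hits)
```

Subtracting the background before computing the ratio keeps a compound that leaves only non-specific signal from being scored as a partial inhibitor; the threshold would in practice be set from the variability of the controls.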

In a particular mode, this method of the invention is suited to the identification and/or isolation of agonists and antagonists of the interaction between topoisomerase IIIα and the polypeptides of the invention. Still according to a particular mode, the invention provides a method for identifying molecules capable of blocking the interaction between human Topoisomerase IIIα and the helicase DBX1.

Such a method makes it possible to identify molecules capable of blocking the reverse gyrase type activity of these two proteins and thus modulate the processes of division, replication, transcription, translation, splicing, repair or recombination of DNA. These molecules are capable of possessing a cytotoxic type antitumor activity because of the disruption of these basic processes at the level of the DNA.

In this regard, another subject of the invention relates to compounds or ligands capable of interfering at the level of the interaction between topoisomerase IIIα and the polypeptides according to the invention or the DBX1 protein and which are capable of being obtained according to the method defined above.

The invention also relates to the use of a compound or of a ligand identified and/or obtained according to the method described above as a medicament. Such compounds are indeed capable of being used for the prevention, improvement or treatment of certain conditions involving a cell cycle dysfunction.

The subject of the invention is also any pharmaceutical composition comprising, as active ingredient, at least one ligand obtained according to the method described above.

Other advantages of the present invention will emerge from reading the examples which follow and which should be considered as illustrative and nonlimiting.

LEGEND TO THE FIGURES

FIG. 1: This figure represents the beginning and the end of the sequence SEQ ID No. 1 so as to present the introduction of the BamHI and SalI sites in 5′ and 3′ of the topoisomerase IIIα coding sequence and the position of the XhoI and HindIII sites.

MATERIALS AND METHODS

1) General Molecular Biology Techniques

The methods conventionally used in molecular biology such as preparative extractions of plasmid DNA, centrifugation of plasmid DNA in cesium chloride gradient, electrophoresis on agarose or acrylamide gels, purification of DNA fragments by electroelution, phenol or phenol-chloroform extractions of proteins, precipitation of DNA in saline medium with ethanol or isopropanol, transformation in Escherichia coli, and the like, are well known to persons skilled in the art and are abundantly described in the literature [Maniatis T. et al., “Molecular Cloning, a Laboratory Manual”, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1982; Ausubel F. M. et al. (eds), “Current Protocols in Molecular Biology”, John Wiley & Sons, New York, 1987].

For the ligations, the DNA fragments may be separated according to their size by electrophoresis on agarose or acrylamide gels, extracted with phenol or with a phenol/chloroform mixture, precipitated with ethanol and then incubated in the presence of T4 phage DNA ligase (Biolabs) according to the supplier's recommendations.

The filling of the protruding 5′ ends may be carried out with the Klenow fragment of E. coli DNA Polymerase I (Biolabs) according to the supplier's specifications. The destruction of the protruding 3′ ends is carried out in the presence of the T4 phage DNA Polymerase (Biolabs) used according to the manufacturer's recommendations. The destruction of the protruding 5′ ends is carried out by a controlled treatment with S1 nuclease.

Mutagenesis directed in vitro by synthetic oligodeoxynucleotides may be carried out according to the method developed by Taylor et al. [Nucleic Acids Res. 13 (1985) 8749-8764] using the kit distributed by Amersham.

Enzymatic amplification of DNA fragments by the so-called PCR technique [Polymerase-catalyzed Chain Reaction, Saiki R. K. et al., Science 230 (1985) 1350-1354; Mullis K. B. and Faloona F. A., Meth. Enzym. 155 (1987) 335-350] may be carried out using a “DNA thermal cycler” (Perkin Elmer Cetus) according to the manufacturer's specifications.

The verification of the nucleotide sequences may be carried out by the method developed by Sanger et al. [Proc. Natl. Acad. Sci. USA, 74 (1977) 5463-5467] using the kit distributed by Amersham.

2) The Yeast Strains Used are

The strain yCM17 of the genus S. cerevisiae (MATa, ura3-52, his3-200, ade2-101, lys2-801, trp1-901, leu2-3,112, canr, gal4-542, gal80-538, URA3::GAL1/10-lacZ-URA3) was used as a tool for screening the HeLa cell fusion library by the two-hybrid system.

The strain L40 of the genus S. cerevisiae (MATa, his3D200, trp1-901, leu2-3,112, ade2, LYS2::(lexAop)4-HIS3, URA3::(lexAop)8-LacZ, GAL4) was used to verify the protein-protein interactions when one of the protein partners is fused with the LexA protein. The latter is capable of recognizing the LexA response element controlling the expression of the LacZ and His3 reporter genes.

They were cultured on the following culture media:

Complete YPD medium: yeast extract (10 g/l) (Difco), bactopeptone (20 g/l) (Difco), glucose (20 g/l) (Merck). This medium was made solid by addition of 20 g/l of agar (Difco).

Minimum YNB medium: Yeast Nitrogen Base (without amino acids) (6.7 g/l) (Difco), glucose (20 g/l) (Merck). This medium may be made solid by addition of 20 g/l of agar (Difco). This medium is supplemented with amino acids or nitrogen bases (50 mg/ml) which are necessary to bring about the growth of auxotrophic yeasts. Ampicillin (100 μg/ml) is added to the medium so as to avoid bacterial contaminations.

3) The Bacterial Strains Used are

The Escherichia coli TG1 strain of the supE, hsdΔ5, thi, Δ(lac-proAB), F′[traD36 proA⁺B⁺ lacI^(q) lacZΔM15] genotype was used for the construction of the plasmids pLex-TopoIIIα and pGBT-TopoIIIα.

The Escherichia coli HB101 strain of the supE44, ara14, galK2, lacY1, Δ(gpt-proA)62, rpsL20(Str^(r)), xyl-5, recA13, Δ(mcrC-mrr), HsdS⁻(r⁻m⁻) genotype was used as a means for amplifying and isolating plasmids obtained from the HeLa cell cDNA library.

The TG1 strain was cultured on LB medium: NaCl (5 g/l) (Difco), bactotryptone (10 g/l) (Difco), yeast extract (5 g/l) (Difco). This medium may be made solid by adding 20 g/l of agar (Difco). Ampicillin was used at 100 μg/ml for the selection of bacteria which have received plasmids carrying, as marker, the gene for resistance to this antibiotic.

The HB101 strain was cultured on M9 medium: Na₂HPO₄ (7 g/l) (Sigma), KH₂PO₄ (3 g/l) (Sigma), NH₄Cl (1 g/l) (Sigma), NaCl (0.5 g/l) (Sigma), glucose (20 g/l) (Sigma), MgSO₄ (1 mM) (Sigma), thiamine (0.001%). This medium is made solid by adding 15 g/l of agar (Difco).

Leucine (50 mg/l) (Sigma) and proline (50 mg/l) (Sigma) are added to the M9 medium to allow growth of the HB101 strain. During the selection of plasmids obtained from the HeLa cell two-hybrid cDNA library, leucine was not added to the medium because the plasmids carry a Leu2 selection marker.

3) The Plasmids Used are

Vector pGBT9 (+2): this plasmid is derived from the plasmid pGBT9 (Clontech). It exhibits a reading frame shift of +2, upstream of the EcoRI site, in the zone corresponding to the multiple cloning site. The difference in sequence between pGBT9 (+2) and pGBT9, upstream of the EcoRI site (underlined), is represented in bold below:

SEQ ID No. 7 pGBT9 (+2):

TCG CCG GAA TTG AAT TCC CGG GGA TCC GT

SEQ ID No. 8 pGBT9:

TCG CCG GAA TTC CCG GGG ATC CGT
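The +2 reading frame shift can be checked directly from the two sequences above. The sketch below locates the EcoRI site (GAATTC) in each sequence and reports how many nucleotides were added upstream of it, and the resulting shift modulo 3. The function name `frame_shift` is a hypothetical helper introduced for illustration only.

```python
ECORI = "GAATTC"  # EcoRI recognition site

# Sequences as printed in SEQ ID No. 7 and SEQ ID No. 8, spaces removed.
PGBT9_PLUS2 = "TCG CCG GAA TTG AAT TCC CGG GGA TCC GT".replace(" ", "")
PGBT9 = "TCG CCG GAA TTC CCG GGG ATC CGT".replace(" ", "")


def frame_shift(modified, reference, site=ECORI):
    """Return (extra_nt, shift) for `modified` relative to `reference`.

    extra_nt: nucleotides added upstream of the first occurrence of `site`.
    shift:    the corresponding reading frame shift (extra_nt modulo 3).
    """
    pos_mod = modified.find(site)
    pos_ref = reference.find(site)
    if pos_mod < 0 or pos_ref < 0:
        raise ValueError("site not found in both sequences")
    extra = pos_mod - pos_ref
    return extra, extra % 3


if __name__ == "__main__":
    extra, shift = frame_shift(PGBT9_PLUS2, PGBT9)
    # Five nucleotides are added upstream of the EcoRI site, i.e. a shift of +2.
    print(f"extra nucleotides: {extra}, frame shift: +{shift}")
```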

The vector pGBT9 (+2) is a shuttle plasmid of 5.4 kb which possesses a bacterial and a yeast replication origin allowing it to replicate in a high copy number in these two microorganisms. This plasmid contains a multiple cloning site situated downstream of the sequence encoding the DNA-binding domain of GAL4 and upstream of a terminator to form a fusion protein. It also contains the S. cerevisiae TRP1 gene which makes it possible to complement yeasts of the trp1 genotype so as to select them on a minimum medium not containing tryptophan. This vector carries the gene for resistance to ampicillin which makes it possible to select the bacteria on a medium containing ampicillin.

pGBT-HaRasVal12: plasmid derived from pGBT9 and comprising the sequence encoding the HaRas protein mutated at position Val12 known to interact with the mammalian Raf protein. This plasmid was used to test the specificity of interaction of the protein according to the invention with human topoisomerase IIIα.

pGBT-Fe65: plasmid derived from pGBT9 and comprising a portion of the sequence encoding the Fe65 protein known to interact with the cytoplasmic region of APP (Amyloid Peptide Precursor). This plasmid was used as a control to verify the specificity of interaction of the protein according to the invention with human topoisomerase IIIα.

The vector pGAD GH: provided by Clontech, it allows the expression in yeast of fusion proteins between the transactivating domain of GAL4 and a protein encoded by the cDNA obtained from a HeLa cell library, inserted at the level of the EcoRI and XhoI sites.

The vector pLex9 (pBTM116) (Bartel et al., D. A. Hartley Ed., Oxford University Press, page 153): vector of 5 kb, homologous to pGBT10, which contains a multiple cloning site downstream of the sequence encoding the bacterial LexA repressor and upstream of a terminator to form a fusion protein.

4) The Synthetic Oligonucleotides Used are

SEQ ID No. 9 oligonucleotide 124

CGAGGTCTGAGGATGATCTT

SEQ ID No. 10 oligonucleotide 125

CTGAGAAAGTGGCGTTCTCT

This pair of oligonucleotides served to amplify by PCR, starting with a HeLa cell cDNA library, a fragment corresponding to the sequence encoding human topoisomerase IIIα.

SEQ ID No. 11 oligonucleotide Top3Xho1

AAGTTACTCGAGATGGCCCTCCGAGG

SEQ ID No. 12: oligonucleotide Top3Hind3

ACGAGCAAGCTTCTCTACCCTACCCTG

The pair of oligonucleotides Top3Xho1 and Top3Hind3 made it possible to introduce the XhoI and HindIII sites respectively during a second PCR step on the fragment corresponding to topoisomerase IIIα previously amplified by means of oligonucleotides 124 and 125.
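The restriction sites carried by the two primers above can be located with a simple motif search. The sketch below uses the standard recognition sequences of XhoI (CTCGAG) and HindIII (AAGCTT); the helper name `find_sites` is hypothetical. Both motifs are palindromic, so scanning a single strand is sufficient.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"XhoI": "CTCGAG", "HindIII": "AAGCTT"}

# Primer sequences as given in SEQ ID No. 11 and SEQ ID No. 12.
PRIMERS = {
    "Top3Xho1": "AAGTTACTCGAGATGGCCCTCCGAGG",
    "Top3Hind3": "ACGAGCAAGCTTCTCTACCCTACCCTG",
}


def find_sites(seq, sites=SITES):
    """Map enzyme name to the 0-based start index of its site in seq.

    Only enzymes whose recognition sequence occurs in seq are reported.
    """
    seq = seq.upper()
    return {name: seq.find(motif) for name, motif in sites.items() if motif in seq}


if __name__ == "__main__":
    for primer, seq in PRIMERS.items():
        print(primer, find_sites(seq))
```

In Top3Xho1 the XhoI site is immediately followed by ATG, consistent with the start of the coding sequence in SEQ ID No. 1.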

SEQ ID No. 13: oligonucleotide PCS1

AATTGCGAATTCTCGAGCCCGGGGATCCGTCGACTGCA

SEQ ID No. 14: oligonucleotide PCS2

GTCGCAGGATCCCCGGGCTCGAGAATTCGC

The pair of oligonucleotides PCS1 and PCS2 made it possible to introduce into the plasmid pLex9 an XhoI site in frame with the human topoisomerase IIIα coding sequence. The insert comprising the gene encoding topoisomerase IIIα was therefore recloned into this vector between the XhoI site in 5′ and the SalI site in 3′.

SEQ ID No. 15 oligonucleotide GAL4TA

CCACTACAATGGATGATG

This oligonucleotide was used to sequence the inserts contained in the plasmids of the HeLa cell two-hybrid cDNA library.

The oligonucleotides are synthesized on the Applied System ABI 394-08 apparatus. They are detached from the synthesis template with ammonia and precipitated twice with 10 volumes of n-butanol and then taken up in water. The quantification is carried out by measuring the optical density (one OD unit corresponds to 30 μg/ml).
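The quantification rule stated above (one OD unit corresponds to 30 μg/ml) amounts to a single multiplication. The sketch below applies it to a reading taken on a diluted sample; the function name `oligo_concentration`, the dilution factor and the example reading are illustrative assumptions.

```python
UG_PER_ML_PER_OD = 30.0  # from the text: one OD unit corresponds to 30 ug/ml


def oligo_concentration(od, dilution_factor=1.0):
    """Return the stock concentration in ug/ml.

    od:              optical density measured on the (possibly diluted) sample
    dilution_factor: fold dilution applied before the measurement (assumed input)
    """
    if od < 0 or dilution_factor <= 0:
        raise ValueError("od must be >= 0 and dilution_factor must be > 0")
    return od * UG_PER_ML_PER_OD * dilution_factor


if __name__ == "__main__":
    # Hypothetical reading: OD of 0.5 on a 100-fold dilution of the stock.
    print(oligo_concentration(0.5, dilution_factor=100), "ug/ml")
```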

5) Transformation of the TG1 Bacteria

The entire ligation volume (10 μl) is used to transform the TG1 bacteria made competent by the Chung et al. method (PNAS, 1988 86, 2172-2175).

The TG1 bacteria are cultured in a liquid LB medium for a few hours in a shaking incubator at 37° C., until an OD of 0.6 at 600 nm is obtained. The medium is then centrifuged at 6000 rpm for 10 min. The bacteria are made competent by taking up the bacterial pellet in a volume of TSB (LB medium + 100 g/l of PEG 4000, 5% of DMSO, 10 mM MgCl₂, 10 mM MgSO₄) corresponding to 1/10 of the volume of the initial culture medium. After incubation at 4° C. for 30 to 60 minutes, 200 μl of bacteria are brought into contact with the ligation products for 15 minutes on ice. After addition of 200 μl of LB, the bacteria are incubated for 30 min at 37° C. and then plated on an LB+ampicillin medium.

6) Preparation of Plasmids From the HeLa Cell two-Hybrid cDNA Library (Clontech®)

The HeLa cell two-hybrid cDNA library is sold in the form of bacteria. The latter contain a plasmid pGAD GH containing an insert corresponding to a HeLa cell cDNA. The cDNAs of this library are constituted by means of an oligo-dT primer. These cDNAs are cloned in an orientated manner into the vector pGAD GH at the level of the EcoRI and XhoI sites.

The plasmid DNA of the brain cDNA library was extracted according to the Clontech® protocol. To preserve the representativeness of the library which consists of 1.2×10⁶ independent plasmids, the batch of plasmid DNA was prepared from a number of isolated bacterial colonies corresponding to a little over twice the representativeness of the library, that is 4×10⁶ colonies.

After verification of the titre of the library, 2 μl of bacteria of the HeLa cell two-hybrid cDNA library, previously placed in 8 ml of LB, are plated on a solid medium (16 dishes/770 cm² in LB+ampicillin medium). The colonies which appear are taken up for each of the dishes in 30 ml of liquid LB+ampicillin. The suspensions obtained are incubated at 37° C. for 3 hours. The DNA is then extracted from these strains by the technique for extracting plasmid DNA in a large quantity. The DNA concentration is determined at 260 nm.

7) Transformation of Yeast

The yeasts previously cultured in 100 ml of liquid medium are harvested by centrifugation (3000 rpm, 3 minutes). The pellet is washed twice by centrifuging with 1 ml of sterile water. The yeasts are then taken up in 1 ml of transformation solution I (0.1 M LiAc, 10 mM Tris-HCl pH 7.5, 1 mM EDTA) and then centrifuged (3000 rpm, 3 minutes). The pellet is taken up in 1 ml of transformation solution I. 50 μl of this yeast suspension are brought into contact with 50 μg of salmon sperm DNA and 1 to 5 μg of plasmid DNA and 300 μl of a transformation solution II (0.1 M LiAc, 10 mM Tris-HCl pH 7.5, 1 mM EDTA in 40% PEG₄₀₀₀). This mixture is incubated at 28° C. for 30 minutes. After application of a heat shock (40° C., 15 minutes), the cells are harvested by centrifugation (15000 rpm for 1 min). This pellet is taken up in 200 μl of water and then plated on a minimum agar medium not containing the amino acids corresponding to the selection markers carried by the plasmids transforming the yeasts. The yeasts are incubated for 72 hours at 28° C.

8) Transformation of Yeast With the HeLa Cell two-Hybrid cDNA Library

The yeast used was transformed beforehand with the plasmid pLex-TopoIIIα. It is cultured in minimum YNB+His+Lys+Ad+Leu medium (250 ml), at 28° C., with stirring until a density of 10⁷ cells/ml is obtained. The cells are harvested by centrifugation (3000 rpm, 10 minutes) and then taken up in 250 ml of water. After another centrifugation, the cellular pellet is taken up in 100 ml of water and again centrifuged. The pellet is then taken up in 10 ml of transformation solution I and incubated for 1 hour at 28° C. with stirring. After centrifugation, the cells are again taken up in 2.5 ml of transformation solution I, 100 μl of the HeLa cell cDNA library and 20 ml of transformation solution II, and then incubated for 1 hour at 28° C. with stirring. A heat shock is applied to this transformation mixture at 42° C. for 20 minutes. The cells are then centrifuged and the cellular pellet harvested is washed with 10 ml of sterile water. This operation is repeated twice and then the pellet is taken up in 2.5 ml of PBS. At this stage, the PEG which is toxic to the cells is removed. 2.4 ml of this suspension are used to inoculate 250 ml of minimum medium containing His, Lys and Ad, which is cultured overnight in a shaker at 28° C. The remaining 100 μl of this suspension serve to determine the transformation efficiency by dilution on solid minimum medium in the presence of His, Lys and Ad. The overnight culture is then centrifuged (3000 rpm for 5 min) and washed twice with sterile water. The pellet is then taken up in 2.5 ml of water. One aliquot of 2.4 ml of this mixture is brought to 10 ml in sterile water; this solution is used to inoculate 10 dishes of 435 cm² containing 200 ml of YNB+Lys+Ad medium, which are incubated for 3 days. The remaining 100 μl are used to determine the level of amplification of the number of colonies during an overnight culture.

9) Extraction of Nucleic Acids From Yeasts

The equivalent of an average loopful of a yeast clone is placed in 200 μl of a TELT solution (2% Triton X100, 1% SDS, 100 mM NaCl, 10 mM Tris pH 8, 1 mM EDTA), in the presence of 3 g of glass beads 450 μm in diameter and 200 μl of phenol/chloroform. This mixture is stirred for 15 minutes and then centrifuged for 2 minutes at 14000 rpm. The supernatant is collected without removing the protein cake and the DNA contained in this phase is precipitated with 2.5 volumes of absolute ethanol. After centrifuging for 2 minutes at 14000 rpm, the DNA pellet is dried and taken up in 20 μl of TE-RNAse. 3 μl of this DNA solution previously dialyzed against water, which corresponds to a mixture of nucleic acids, serves directly to transform HB101 bacteria. Only the plasmid DNA is capable of replicating in the bacteria and may be analyzed by the technique for preparing plasmid DNA from bacteria in a small quantity.

10) Test for β-galactosidase Activity

A nitrocellulose sheet is deposited beforehand on the Petri dish containing the individualized yeast clones. This sheet is then immersed in liquid nitrogen for 30 seconds so as to break the yeasts and thus release the β-galactosidase activity. After thawing, the nitrocellulose sheet is deposited, colonies at the top, in another Petri dish containing a Whatman 3M paper impregnated beforehand with 1.5 ml of PBS solution (60 mM Na₂HPO₄, 40 mM NaH₂PO₄, 10 mM KCl, 1 mM MgSO₄, pH 7) and 10 to 30 μl of X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside) at 50 mg/ml in N,N-dimethylformamide. The dish is then incubated at 37° C.

EXAMPLE 1 Construction of a Vector Allowing the Expression of a Protein From the Fusion Between Human Topoisomerase IIIα and a DNA-binding Protein

The screening of a cDNA library using the two-hybrid system requires beforehand that the human topoisomerase IIIα is fused with a protein capable of binding to the promoters controlling the expression of reporter genes, such as the bacterial LexA repressor or the DNA-binding domain (DB) of GAL4. The expression of the fusion proteins is carried out by means of the vector pLex9 in the case of a fusion with the LexA protein or by means of the vector pGBT9 (+2) for a fusion with the DB of GAL4 (cf. Materials and Methods). The sequence encoding the human topoisomerase IIIα presented in SEQ ID No. 1 was introduced into these two types of vector in the same reading frame as the sequence corresponding to the LexA protein or to the DB of GAL4.

The DNA fragment corresponding to the sequence encoding human topoisomerase IIIα was amplified by PCR from a HeLa cell cDNA library (Clontech) by means of oligonucleotides 124 and 125. A second PCR amplification step was performed on the DNA fragment so as to introduce at the two ends the XhoI and HindIII sites by means of the pair of oligonucleotides Top3Xho1 and Top3Hind3. The new DNA fragment obtained, digested with XhoI and HindIII, was introduced at the corresponding sites into the vector pBlueBacHis2A (Invitrogen) which gives the possibility of using new BamHI and SalI restriction sites (represented in bold with the XhoI and HindIII sites in FIG. 1) so as to produce the final constructs.

The plasmid pLex-TopoIIIα was constructed by inserting the XhoI-SalI fragment of the preceding plasmid, corresponding to human topoisomerase IIIα, into the plasmid pLex9 modified beforehand by insertion of the oligonucleotides PCS1 and PCS2 at the EcoRI-PstI sites. This plasmid was used to screen a HeLa cell two-hybrid cDNA library with the aim of identifying proteins interacting with human topoisomerase IIIα.

The plasmid pGBT-TopoIIIα was constructed by inserting, at the BamHI and SalI sites of the plasmid pGBT9 (+2), a fragment obtained by partial digestion with BamHI and total digestion with SalI and corresponding to human topoisomerase IIIα. This plasmid was used to validate, by the two-hybrid technique, the specificity of interaction of the proteins selected during the screening with human topoisomerase IIIα.

The constructs were verified by sequencing the DNA. This verification made it possible to show that the fragments of human topoisomerase IIIα did not exhibit mutations generated during the PCR reaction and that they were fused in the same open reading frame as that of the fragments corresponding to the LexA protein or to the DB of GAL4.

EXAMPLE 2 Screening by the two-Hybrid Technique of a HeLa Cell cDNA Library

The screening of a fusion library makes it possible to identify clones producing proteins fused with the transactivating domain of GAL4, which can interact with topoisomerase IIIα. This interaction makes it possible to reconstitute a transactivator which will then be capable of inducing the expression of the reporter genes His3 and LacZ in the L40 strain used.

To carry out this screening, a fusion library produced from cDNA obtained from HeLa cells was chosen.

Transformation of Yeast With the HeLa Cell two-Hybrid cDNA Library and Selection of the Positive Clones

During the screening, it is necessary to preserve the probability that each independent plasmid of the fusion library is present in at least one yeast at the same time as the plasmid pLex-TopoIIIα. To preserve this probability, it is important to have a good efficiency of transformation of the yeast; for this purpose, a yeast transformation protocol giving an efficiency of 10⁵ transformed cells per μg of DNA was chosen. Furthermore, as the cotransformation of yeast with two different plasmids reduces this efficiency, an L40 yeast transformed beforehand with the plasmid pLex-TopoIIIα was used. This strain containing pLex-TopoIIIα, of the phenotype His-, Lys-, Leu-, was transformed with 100 μg of plasmid DNA of the two-hybrid library. This quantity of DNA made it possible to obtain, after estimation (see Materials and Methods), 6×10⁶ transformed cells, which corresponds to the number of independent plasmids which constitute the library. It is thus possible to estimate that less than all of the plasmids of the library served to transform the yeasts. The selection of the transformed cells, capable of reconstituting a functional GAL4 transactivator, was performed on a YNB+Lys+Ad medium.
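The figures quoted above can be combined into a rough coverage estimate. The sketch below first computes the number of transformants expected from the stated efficiency and DNA quantity, then estimates the fraction of library plasmids sampled at least once. The coverage formula assumes that transformants sample the library plasmids uniformly and independently (a Poisson approximation); this model is an assumption introduced here and is not stated in the text. The library size of 1.2×10⁶ independent plasmids is the value quoted in Materials and Methods. Function names are hypothetical.

```python
import math


def expected_transformants(efficiency_per_ug, dna_ug):
    """Transformants expected from an efficiency (cells per ug) and a DNA mass (ug)."""
    return efficiency_per_ug * dna_ug


def library_coverage(transformants, library_size):
    """Fraction of library plasmids expected to be present at least once.

    Assumes uniform, independent sampling (Poisson approximation):
    coverage = 1 - exp(-transformants / library_size).
    """
    if transformants < 0 or library_size <= 0:
        raise ValueError("transformants must be >= 0 and library_size > 0")
    return 1.0 - math.exp(-transformants / library_size)


if __name__ == "__main__":
    # Values quoted in the text.
    print("expected transformants:", expected_transformants(1e5, 100))
    coverage = library_coverage(6e6, 1.2e6)
    print(f"estimated coverage: {coverage:.4f}")
```

Under this model, the 6×10⁶ transformants obtained would cover most, but not all, of the independent plasmids.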

At the end of this selection, about 500 clones of the His+ phenotype were obtained. A test for β-galactosidase activity was performed on these transformants so as to determine the number of clones expressing the other reporter gene, LacZ. Of the 500 clones obtained, sixty-three exhibited the double phenotype His+ and βGal+, thus showing that they express proteins which can interact with human topoisomerase IIIα.

EXAMPLE 3 Isolation of the Plasmids From the Yeast Clones Selected

To identify the proteins which interact with human topoisomerase IIIα, the plasmids obtained from the two-hybrid library of the yeasts selected during the two-hybrid screening were extracted. The DNA of the yeast strains of the phenotype His+ and βGal+ is used to transform the E. coli HB101 strain.

The plasmid DNAs of the bacterial colonies obtained after transformation with yeast DNA extracts were analyzed by digesting with restriction enzymes and separating the DNA fragments on agarose gel. Two different restriction profiles were obtained out of 15 yeast clones analyzed. One of these profiles was highly represented. These results show that at least 2 different plasmids were isolated during this screening; the DNA fragment obtained from the cDNA library contained in the most highly represented plasmid was selected for the remainder of the study.

EXAMPLE 4 Determination of the Sequence of the Insert Contained in the Plasmid Selected

The sequencing was carried out on the most highly represented plasmid. The sequencing is carried out using the oligonucleotide GAL4TA complementary to the region close to the site of insertion of the HeLa cell cDNA library, at 52 base pairs from the EcoRI site.

Comparison of the sequence obtained with the sequences contained in the GenBank and EMBL (European Molecular Biology Lab) databanks has shown that the sequence of the cDNA present in the plasmid selected exhibits 98.2% identity at the nucleic acid level with the human gene encoding the Dead Box X isoform protein (DBX1) also called helicase like protein 2 (DDX14) having the accession number AF000982 and U50553 respectively. Comparison of the sequence of the cDNA present in the plasmid selected also shows 98.1% identity with the DDX14 protein.

The nucleotide and polypeptide sequence of DBX1 is presented in the sequence SEQ ID No. 5. The sequence of the gene cloned by the two-hybrid system starts at nucleotide 952 relative to the putative initiation codon, that is at the 318th amino acid, and contains a sequence homologous to the sequence encoding the C-terminal part of the DBX1 protein including the stop codon.

This result shows that the domain for interaction of the protein or polypeptide partners of human topoisomerase IIIα is contained in the second, C-terminal half of said partners.

Differences were noted relative to the published DBX1 sequence, in particular the AGT codon (at position 1768 relative to the initiation codon, that is at position 2624 on the sequence SEQ ID No. 5) encoding serine 590 is absent in the cloned fragment.

Likewise, the presence of a C residue in place of a T at position 2068 relative to the ATG was noted.
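The correspondence between nucleotide positions and amino acid numbers quoted above (nucleotide 952 and amino acid 318; nucleotide 1768 and serine 590) follows from counting codons from the A of the initiation codon. The sketch below makes that arithmetic explicit; the function names are hypothetical and positions are 1-based, with the A of the ATG counted as position 1.

```python
def codon_number(nt_position):
    """1-based codon (amino acid) index containing a 1-based nucleotide position.

    Position 1 is the A of the ATG initiation codon.
    """
    if nt_position < 1:
        raise ValueError("nt_position must be >= 1")
    return (nt_position - 1) // 3 + 1


def is_codon_start(nt_position):
    """True if the nucleotide is the first base of its codon."""
    if nt_position < 1:
        raise ValueError("nt_position must be >= 1")
    return (nt_position - 1) % 3 == 0


if __name__ == "__main__":
    for pos in (952, 1768):
        print(pos, "-> codon", codon_number(pos), "| first base:", is_codon_start(pos))
    # 952 -> codon 318, 1768 -> codon 590; both are the first base of their codon.
```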

The sequence of the cloned fragment is represented in SEQ ID No. 3.

EXAMPLE 5 Analysis of the Specificity of Interaction Between Topoisomerase IIIα and the Polypeptides of the Invention

The specificity of interaction between human topoisomerase IIIα and the polypeptide according to the invention was confirmed in a two-hybrid interaction test using the plasmid pGBT-TopoIIIα in place of the plasmid pLex-TopoIIIα. The plasmid pGBT-TopoIIIα comprises the gene encoding human topoisomerase IIIα fused with the DNA-binding domain of GAL4.

The strain yCM17 was transformed with the plasmid isolated during the screening of the two-hybrid library and with the plasmid pGBT-TopoIIIα. Controls for specificity of interaction were also performed by transforming this strain with the control plasmids pGBT-HaRasVal12 or pGBT-Fe65, in place of the plasmid pGBT-TopoIIIα. A test of β-Gal activity on the cells transformed with the various plasmids was performed to demonstrate the protein-protein interactions.

The results of the test showed that only the yeasts transformed with the plasmid isolated during the screening of the two-hybrid library and with the plasmid pGBT-TopoIIIα exhibited a β-Gal+ activity, thus showing interaction between human topoisomerase IIIα and the C-terminal region of the polypeptides according to the invention. These results also show that this interaction is independent of the fusion protein used.

15 1 2973 DNA Homo sapiens 1 ggatccgagc tcgagatggc cctccgaggc gtgcggaaagtcctctgtgt ggccgaaaaa 60 aacgacgcgg ccaaggggat cgccgacctg ctgtcaaacggtcgcatgag gcggagagaa 120 ggactttcaa aattcaacaa gatctatgaa tttgattatcatctgtatgg ccagaatgtt 180 accatggtaa tgacttcagt ttctggacat ttactggctcatgatttcca gatgcagttt 240 cgaaaatggc agagctgcaa ccctcttgtc ctctttgaagcagaaattga aaagtactgc 300 ccagagaatt ttgtagacat caagaaaact ttggaacgagagactcgcca gtgccaggct 360 ctggtgatct ggactgactg tgatagagaa ggcgaaaacatcgggtttga gattatccac 420 gtgtgtaagg ctgtaaagcc caatctgcag gtgttgcgagcccgattctc tgagatcaca 480 ccccatgccg tcaggacagc ttgtgaaaac ctgaccgagcctgatcagag ggtgagcgat 540 gctgtggatg tgaggcagga gctggacctg aggattggagctgcctttac taggttccag 600 accctgcggc ttcagaggat ttttcctgag gtgctggcagagcagctcat cagttacggc 660 agctgccagt tccccacact gggctttgtg gtggagcggttcaaagccat tcaggctttt 720 gtaccagaaa tcttccacag aattaaagta actcatgaccacaaagatgg tatcgtagaa 780 ttcaactgga aaaggcatcg actctttaac cacacggcttgcctagttct ctatcagttg 840 tgtgtggagg atcccatggc aactgtggta gaggtcagatctaagcccaa gagcaagtgg 900 cggcctcaag ccttggacac tgtggagctt gagaagctggcttctcgaaa gttgagaata 960 aatgctaaag aaaccatgag gattgctgag aagctctacactcaagggta catcagctat 1020 ccccgaacag aaacaaacat ttttcccaga gacttaaacctgacggtgtt ggtggaacag 1080 cagacccccg atccacgctg gggggccttt gcccagagcattctagagcg gggtggtccc 1140 accccacgca atgggaacaa gtctgaccaa gctcaccctcccattcaccc caccaaatac 1200 accaacaact tacagggaga tgaacagcga ctgtacgagtttattgttcg ccatttcctg 1260 gcttgctgct cccaggatgc tcaggggcag gagaccacagtggagatcga catcgctcag 1320 gaacgctttg tggcccatgg cctcatgatt ctggcccgaaactatctgga tgtgtatcca 1380 tatgatcact ggagtgacaa gatcctccct gtctatgagcaaggatccca ctttcagccc 1440 agcaccgtgg agatggtgga cggggagacc agcccacccaagctgctcac cgaggccgac 1500 ctcattgccc tcatggagaa gcatggcatt ggtacggatgccactcatgc ggagcacatc 1560 gagaccatca aagcccggat gtacgtgggc ctcaccccagacaagcggtt cctccctggg 1620 cacctgggca tgggacttgt ggaaggttat gattccatgggctatgaaat gtctaagcct 1680 gacctccggg ctgaactgga 
agctgatctg aagctgatctgtgatggcaa aaaggacaaa 1740 tttgtggttc taaggcagca agtgcagaaa tacaagcaggttttcattga agcggtggct 1800 aaagcaaaga aattggacga ggccttggcc cagtactttgggaatgggac agagttggcc 1860 cagcaagaag atatctaccc agccatgcca gagcccatcaggaagtgccc acagtgcaac 1920 aaggacatgg tccttaagac caagaagaat ggcgggttctacctcagctg catgggtttc 1980 ccagagtgtc gctcagctgt gtggcttcct gactcggtgctggaggccag cagggacagc 2040 agtgtgtgtc cagtttgtca gccacaccct gtgtacaggttaaagttaaa gtttaagcgc 2100 ggtagccttc ccccgaccat gcctctggag tttgtttgctgcatcggcgg atgcgacgac 2160 accctgaggg agatcctgga cctgagattt tcagggggcccccccagggc tagccagccc 2220 tctggccgcc tgcaggctaa ccagtccctg aacaggatggacaacagcca gcacccccag 2280 cctgctgaca gcagacagac tgggtcctca aaggctctggcccagaccct cccaccaccc 2340 acggctgctg gtgaaagcaa ttctgtgacc tgcaactgtggccaggaggc tgtgctgctc 2400 actgtccgta aggagggccc caaccggggc cggcagttctttaagtgcaa cggaggtagc 2460 tgcaacttct tcctgtgggc agacagcccc aatccgggagcaggagggcc tcctgccttg 2520 gcatatagac ccctgggcgc ctccctggga tgcccaccaggcccagggat ccacctaggt 2580 gggtttggca accctggtga tggcagtggt agtggcacatcctgcctttg cagccagccc 2640 tccgtcacac ggactgtgca gaaggatgga cccaacaaggggcgccagtt ccacacatgt 2700 gccaagccga gagagcagca gtgtggcttt ttccagtgggtcgatgagaa caccgctcca 2760 gggacttctg gagccccgtc ctggacagga gacagaggaagaaccctgga gtcggaagcc 2820 agaagcaaaa ggccccgggc cagttcctca gacatggggtccacagcaaa gaaaccccgg 2880 aaatgcagcc tttgccacca gcctggacac acccgtcccttttgtcctca gaacagatga 2940 gctcagggta gggtagagaa gcttggagtc gac 2973 2974 PRT Homo sapiens 2 Met Ala Leu Arg Gly Val Arg Lys Val Leu Cys ValAla Glu Lys Asn 1 5 10 15 Asp Ala Ala Lys Gly Ile Ala Asp Leu Leu SerAsn Gly Arg Met Arg 20 25 30 Arg Arg Glu Gly Leu Ser Lys Phe Asn Lys IleTyr Glu Phe Asp Tyr 35 40 45 His Leu Tyr Gly Gln Asn Val Thr Met Val MetThr Ser Val Ser Gly 50 55 60 His Leu Leu Ala His Asp Phe Gln Met Gln PheArg Lys Trp Gln Ser 65 70 75 80 Cys Asn Pro Leu Val Leu Phe Glu Ala GluIle Glu Lys Tyr Cys Pro 85 90 95 Glu Asn Phe Val Asp Ile Lys Lys Thr LeuGlu Arg 
Glu Thr Arg Gln 100 105 110 Cys Gln Ala Leu Val Ile Trp Thr AspCys Asp Arg Glu Gly Glu Asn 115 120 125 Ile Gly Phe Glu Ile Ile His ValCys Lys Ala Val Lys Pro Asn Leu 130 135 140 Gln Val Leu Arg Ala Arg PheSer Glu Ile Thr Pro His Ala Val Arg 145 150 155 160 Thr Ala Cys Glu AsnLeu Thr Glu Pro Asp Gln Arg Val Ser Asp Ala 165 170 175 Val Asp Val ArgGln Glu Leu Asp Leu Arg Ile Gly Ala Ala Phe Thr 180 185 190 Arg Phe GlnThr Leu Arg Leu Gln Arg Ile Phe Pro Glu Val Leu Ala 195 200 205 Glu GlnLeu Ile Ser Tyr Gly Ser Cys Gln Phe Pro Thr Leu Gly Phe 210 215 220 ValVal Glu Arg Phe Lys Ala Ile Gln Ala Phe Val Pro Glu Ile Phe 225 230 235240 His Arg Ile Lys Val Thr His Asp His Lys Asp Gly Ile Val Glu Phe 245250 255 Asn Trp Lys Arg His Arg Leu Phe Asn His Thr Ala Cys Leu Val Leu260 265 270 Tyr Gln Leu Cys Val Glu Asp Pro Met Ala Thr Val Val Glu ValArg 275 280 285 Ser Lys Pro Lys Ser Lys Trp Arg Pro Gln Ala Leu Asp ThrVal Glu 290 295 300 Leu Glu Lys Leu Ala Ser Arg Lys Leu Arg Ile Asn AlaLys Glu Thr 305 310 315 320 Met Arg Ile Ala Glu Lys Leu Tyr Thr Gln GlyTyr Ile Ser Tyr Pro 325 330 335 Arg Thr Glu Thr Asn Ile Phe Pro Arg AspLeu Asn Leu Thr Val Leu 340 345 350 Val Glu Gln Gln Thr Pro Asp Pro ArgTrp Gly Ala Phe Ala Gln Ser 355 360 365 Ile Leu Glu Arg Gly Gly Pro ThrPro Arg Asn Gly Asn Lys Ser Asp 370 375 380 Gln Ala His Pro Pro Ile HisPro Thr Lys Tyr Thr Asn Asn Leu Gln 385 390 395 400 Gly Asp Glu Gln ArgLeu Tyr Glu Phe Ile Val Arg His Phe Leu Ala 405 410 415 Cys Cys Ser GlnAsp Ala Gln Gly Gln Glu Thr Thr Val Glu Ile Asp 420 425 430 Ile Ala GlnGlu Arg Phe Val Ala His Gly Leu Met Ile Leu Ala Arg 435 440 445 Asn TyrLeu Asp Val Tyr Pro Tyr Asp His Trp Ser Asp Lys Ile Leu 450 455 460 ProVal Tyr Glu Gln Gly Ser His Phe Gln Pro Ser Thr Val Glu Met 465 470 475480 Val Asp Gly Glu Thr Ser Pro Pro Lys Leu Leu Thr Glu Ala Asp Leu 485490 495 Ile Ala Leu Met Glu Lys His Gly Ile Gly Thr Asp Ala Thr His Ala500 505 510 Glu His Ile Glu Thr Ile Lys Ala Arg Met Tyr Val Gly Leu ThrPro 515 520 525 Asp 
Lys Arg Phe Leu Pro Gly His Leu Gly Met Gly Leu ValGlu Gly 530 535 540 Tyr Asp Ser Met Gly Tyr Glu Met Ser Lys Pro Asp LeuArg Ala Glu 545 550 555 560 Leu Glu Ala Asp Leu Lys Leu Ile Cys Asp GlyLys Lys Asp Lys Phe 565 570 575 Val Val Leu Arg Gln Gln Val Gln Lys TyrLys Gln Val Phe Ile Glu 580 585 590 Ala Val Ala Lys Ala Lys Lys Leu AspGlu Ala Leu Ala Gln Tyr Phe 595 600 605 Gly Asn Gly Thr Glu Leu Ala GlnGln Glu Asp Ile Tyr Pro Ala Met 610 615 620 Pro Glu Pro Ile Arg Lys CysPro Gln Cys Asn Lys Asp Met Val Leu 625 630 635 640 Lys Thr Lys Lys AsnGly Gly Phe Tyr Leu Ser Cys Met Gly Phe Pro 645 650 655 Glu Cys Arg SerAla Val Trp Leu Pro Asp Ser Val Leu Glu Ala Ser 660 665 670 Arg Asp SerSer Val Cys Pro Val Cys Gln Pro His Pro Val Tyr Arg 675 680 685 Leu LysLeu Lys Phe Lys Arg Gly Ser Leu Pro Pro Thr Met Pro Leu 690 695 700 GluPhe Val Cys Cys Ile Gly Gly Cys Asp Asp Thr Leu Arg Glu Ile 705 710 715720 Leu Asp Leu Arg Phe Ser Gly Gly Pro Pro Arg Ala Ser Gln Pro Ser 725730 735 Gly Arg Leu Gln Ala Asn Gln Ser Leu Asn Arg Met Asp Asn Ser Gln740 745 750 His Pro Gln Pro Ala Asp Ser Arg Gln Thr Gly Ser Ser Lys AlaLeu 755 760 765 Ala Gln Thr Leu Pro Pro Pro Thr Ala Ala Gly Glu Ser AsnSer Val 770 775 780 Thr Cys Asn Cys Gly Gln Glu Ala Val Leu Leu Thr ValArg Lys Glu 785 790 795 800 Gly Pro Asn Arg Gly Arg Gln Phe Phe Lys CysAsn Gly Gly Ser Cys 805 810 815 Asn Phe Phe Leu Trp Ala Asp Ser Pro AsnPro Gly Ala Gly Gly Pro 820 825 830 Pro Ala Leu Ala Tyr Arg Pro Leu GlyAla Ser Leu Gly Cys Pro Pro 835 840 845 Gly Pro Gly Ile His Leu Gly GlyPhe Gly Asn Pro Gly Asp Gly Ser 850 855 860 Gly Ser Gly Thr Ser Cys LeuCys Ser Gln Pro Ser Val Thr Arg Thr 865 870 875 880 Val Gln Lys Asp GlyPro Asn Lys Gly Arg Gln Phe His Thr Cys Ala 885 890 895 Lys Pro Arg GluGln Gln Cys Gly Phe Phe Gln Trp Val Asp Glu Asn 900 905 910 Thr Ala ProGly Thr Ser Gly Ala Pro Ser Trp Thr Gly Asp Arg Gly 915 920 925 Arg ThrLeu Glu Ser Glu Ala Arg Ser Lys Arg Pro Arg Ala Ser Ser 930 935 940 SerAsp Met Gly Ser Thr Ala Lys Lys 
Pro Arg Lys Cys Ser Leu Cys 945 950 955960 His Gln Pro Gly His Thr Arg Pro Phe Cys Pro Gln Asn Arg 965 970 31233 DNA Homo sapiens 3 catttgttag tagccactcc aggacgtcta gtggatatgatggaaagagg aaagattgga 60 ttagactttt gcaaatactt ggtgttagat gaagctgatcggatgttgga tatggggttt 120 gagcctcaga ttcgtagaat agtcgaacaa gatactatgcctccaaaggg tgtccgccac 180 actatgatgt ttagtgctac ttttcctaag gaaatacagatgctggctcg tgatttctta 240 gatgaatata tcttcttggc tgtaggaaga gttggctctacctctgaaaa catcacacag 300 aaagtagttt gggtggaaga atcagacaaa cggtcatttctgcttgacct cctaaatgca 360 acaggcaagg attcactgac cttagtgttt gtggagaccaaaaagggtgc agattctctg 420 gaggatttct tataccatga aggatacgca tgtaccagcatccatggaga ccgttctcag 480 agggatagag aagaggccct tcaccagttc cgctcaggaaaaagcccaat tttagtggct 540 acagcagtag cagcaagagg actggacatt tcaaatgtgaaacatgttat caattttgac 600 ttgccaagtg atattgaaga atatgtacat cgtattggtcgtacgggacg tgtaggaaac 660 cttggcctgg caacctcatt ctttaacgag aggaacataaatattactaa ggatttgttg 720 gatcttcttg ttgaagctaa acaagaagtg ccgtcttggttagaaaacat ggcttatgaa 780 caccactaca agggtagcag tcgtggacgt tctaagagcagatttagtgg agggtttggt 840 gccagagact accgacaaag tagcggtgcc agcagttccagcttcagcag cagccgcgca 900 agcagcagcc gcagtggcgg aggtggccac ggtagcagcagaggatttgg tggaggtggc 960 tatggaggct tttacaacag tgatggatat ggaggaaattataactccca gggggttgac 1020 tggtggggta actgagcctg ctttgcagta ggtcaccctgccaaacaagc taatatggaa 1080 accacatgta acttagccag actatacctt gtgtagcttcaagaactcgc agtacattac 1140 cagctgtgat tctccactga aatttttttt ttaagggagctcaaggtcac aagaagaaat 1200 gaaaggaaca atcagcagcc ctgttcagaa gga 1233 4344 PRT Homo sapiens 4 His Leu Leu Val Ala Thr Pro Gly Arg Leu Val AspMet Met Glu Arg 1 5 10 15 Gly Lys Ile Gly Leu Asp Phe Cys Lys Tyr LeuVal Leu Asp Glu Ala 20 25 30 Asp Arg Met Leu Asp Met Gly Phe Glu Pro GlnIle Arg Arg Ile Val 35 40 45 Glu Gln Asp Thr Met Pro Pro Lys Gly Val ArgHis Thr Met Met Phe 50 55 60 Ser Ala Thr Phe Pro Lys Glu Ile Gln Met LeuAla Arg Asp Phe Leu 65 70 75 80 Asp Glu Tyr Ile Phe Leu Ala Val Gly ArgVal Gly Ser Thr 
Ser Glu 85 90 95 Asn Ile Thr Gln Lys Val Val Trp Val GluGlu Ser Asp Lys Arg Ser 100 105 110 Phe Leu Leu Asp Leu Leu Asn Ala ThrGly Lys Asp Ser Leu Thr Leu 115 120 125 Val Phe Val Glu Thr Lys Lys GlyAla Asp Ser Leu Glu Asp Phe Leu 130 135 140 Tyr His Glu Gly Tyr Ala CysThr Ser Ile His Gly Asp Arg Ser Gln 145 150 155 160 Arg Asp Arg Glu GluAla Leu His Gln Phe Arg Ser Gly Lys Ser Pro 165 170 175 Ile Leu Val AlaThr Ala Val Ala Ala Arg Gly Leu Asp Ile Ser Asn 180 185 190 Val Lys HisVal Ile Asn Phe Asp Leu Pro Ser Asp Ile Glu Glu Tyr 195 200 205 Val HisArg Ile Gly Arg Thr Gly Arg Val Gly Asn Leu Gly Leu Ala 210 215 220 ThrSer Phe Phe Asn Glu Arg Asn Ile Asn Ile Thr Lys Asp Leu Leu 225 230 235240 Asp Leu Leu Val Glu Ala Lys Gln Glu Val Pro Ser Trp Leu Glu Asn 245250 255 Met Ala Tyr Glu His His Tyr Lys Gly Ser Ser Arg Gly Arg Ser Lys260 265 270 Ser Arg Phe Ser Gly Gly Phe Gly Ala Arg Asp Tyr Arg Gln SerSer 275 280 285 Gly Ala Ser Ser Ser Ser Phe Ser Ser Ser Arg Ala Ser SerSer Arg 290 295 300 Ser Gly Gly Gly Gly His Gly Ser Ser Arg Gly Phe GlyGly Gly Gly 305 310 315 320 Tyr Gly Gly Phe Tyr Asn Ser Asp Gly Tyr GlyGly Asn Tyr Asn Ser 325 330 335 Gln Gly Val Asp Trp Trp Gly Asn 340 55321 DNA Homo sapiens 5 tttcccctta ctccgctccc ctcttttccc tccctctcctccccttccct ctgttctctc 60 ctcctcttcc cctcccctcc cccgtccggg gcactctatattcaagccac cgtttcctgc 120 ttcacaaaat ggccaccgca cgcgacacct acggtcacgtggcctgccgc cctctcagtt 180 tcgggaatct gcctagctcc cactaagggg aggctacccgcggaagagcg agggcagatt 240 agaccggaga aatcccacca catctccaag cccgggaactgagagaggaa gaagagtgaa 300 ggccagtgtt aggaaaaaaa aaaacaaaaa caaaaaaaacgaaaaacgaa agctgagtgc 360 atagagttgg aaaggggagc gaatgcgtaa ggttggaaaggggggcgaag aggcctaggt 420 taacattttc aggcgtctta gccggtggaa agcgggagacgcaagttctc gcgagatctc 480 gagaactccg aggctgagac tagggtttta gcggagagcacgggaagtgt agctcgagag 540 aactgggaca gcatttcgca ccctaagctc caaggcaggactgctagggg cgacaggact 600 aagtaggaaa tcccttgagc ttagacctga gggagcgcgcagtagccggg cagaagtcgc 660 cgcgacaggg aattgcggtg 
tgagagggag ggcacacgttgtacgtgctg acgtagccgg 720 ctttccagcg ggtatattag atccgtggcc gcgcggtgcgctccagagcc gcagttctcc 780 cgtgagaggg ccttcgcggt ggaacaaaca ctcgcttagcagcggaagac tccgagttct 840 cggtactctt cagggatgag tcatgtggca gtggaaaatgcgctcgggct ggaccagcag 900 tttgctggcc tagacctgaa ctcttcagat aatcagagtggaggaagtac agccagcaaa 960 gggcgctata ttcctcctca tttaaggaac cgagaagctactagaggttt ctacgataaa 1020 gacagttcag ggtggagttc tagcaaagat aaggatgcgtatagcagttt tggatctcgt 1080 agtgattcaa gagggaagtc tagcttcttc agtgatcgtggaagtggatc aaggggaagg 1140 tttgatgatc gtggacggag tgattacgat ggcattggcagccgtggtga cagaagtggc 1200 tttggcaaat ttgaacgtgg tggaaacagt cgctggtgtgacaaatcaga tgaagatgat 1260 tggtcaaaac cactcccacc aagtgaacgc ttggaacaggaactcttttc tggaggcaac 1320 actgggatta attttgagaa atacgatgac attccagttgaggcaacagg caacaactgt 1380 cctccacata ttgaaagttt cagtgatgtt gagatgggagaaattatcat gggaaacatt 1440 gagcttactc gttatactcg cccaactcca gtgcaaaagcatgctattcc tattatcaaa 1500 gagaaaagag acttgatggc ttgtgcccaa acagggtctggaaaaactgc agcatttctg 1560 ttgcccatct tgagtcagat ttattcagat ggtccaggcgaggctttgag ggccatgaag 1620 gaaaatggaa ggtatgggcg ccgcaaacaa tacccaatctccttggtatt agcaccaacg 1680 agagagttgg cagtacagat ctacgaagaa gccagaaaattttcataccg atctagagtt 1740 cgtccttgcg tggtttatgg tggtgccgat attggtcagcagattcgaga cttggaacgt 1800 ggatgccatt tgttagtagc cactccagga cgtctagtggatatgatgga aagaggaaag 1860 attggattag acttttgcaa atacttggtg ttagatgaagctgatcggat gttggatatg 1920 gggtttgagc ctcagattcg tagaatagtc gaacaagatactatgcctcc aaagggtgtc 1980 cgccacacta tgatgtttag tgctactttt cctaaggaaatacagatgct ggctcgtgat 2040 ttcttagatg aatatatctt cttggctgta ggaagagttggctctacctc tgaaaacatc 2100 acacagaaag tagtttgggt ggaagaatca gacaaacggtcatttctgct tgacctccta 2160 aatgcaacag gcaaggattc actgacctta gtgtttgtggagaccaaaaa gggtgcagat 2220 tctctggagg atttcttata ccatgaagga tacgcatgtaccagcatcca tggagaccgt 2280 tctcagaggg atagagaaga ggcccttcac cagttccgctcaggaaaaag cccaatttta 2340 gtggctacag cagtagcagc aagaggactg gacatttcaaatgtgaaaca tgttatcaat 
2400 tttgacttgc caagtgatat tgaagaatat gtacatcgtattggtcgtac gggacgtgta 2460 ggaaaccttg gcctggcaac ctcattcttt aacgagaggaacataaatat tactaaggat 2520 ttgttggatc ttcttgttga agctaaacaa gaagtgccgtcttggttaga aaacatggct 2580 tatgaacacc actacaaggg tagcagtcgt ggacgttctaagagtagcag atttagtgga 2640 gggtttggtg ccagagacta ccgacaaagt agcggtgccagcagttccag cttcagcagc 2700 agccgcgcaa gcagcagccg cagtggcgga ggtggccacggtagcagcag aggatttggt 2760 ggaggtggct atggaggctt ttacaacagt gatggatatggaggaaatta taactcccag 2820 ggggttgact ggtggggtaa ctgagcctgc tttgcagtaggtcaccctgc caaacaagct 2880 aatatggaaa ccacatgtaa cttagccaga ctataccttgtgtagtttca agaactcgca 2940 gtacattacc agctgtgatt ctccactgaa attttttttttaagggagct caaggtcaca 3000 agaagaaatg aaaggaacaa tcagcagccc tgttcagaaggtggtttgaa gacttcattg 3060 ctgtagtttg gattaactcc cctcccgcct acccccatcccaaactgcat ttataatttt 3120 gtgactgagg atcatttgtt tgttaatgta ctgtgcctttaactatagac aactttttat 3180 tttgatgtcc tgttggctca gtaatgctca agatatcaattgttttgaca aaataaattt 3240 actgaacttg ggctaaaatc aaaccttggc acacaggtgtgatacaactt aacaggaatc 3300 atcgattcat ccataaataa tataaggaaa aacttatgcggtagcctgca ttagggcttt 3360 ttgatacttg cagattgggg gaaaacaaca aatgtcttgaagcatattaa tggaattagt 3420 ttctaatgtg gcaaactgta ttaagttaaa gttctgatttgctcactcta tcctggatag 3480 gtatttagaa cctgatagtc tttaagccat tccagtcatgatgaggtgat gtatgaatac 3540 atgcatacat tcaaagcact gttttcaaag ttaatgcaagtaaatacagc aattcctctt 3600 tcaacgttta ggcagatcat taattatgag ctagccaaatgtgggcatac tattacaggg 3660 aaagtttaaa ggtctgataa cttgaaaata ggtttttaggagaattcatc tacttagact 3720 ttttaagtgc ctgccataaa tgaaattgaa atggtagaatggctgaccac agcaatgacc 3780 agccctcatt agggccctgg atgatttttg gtctaataacgcatgctagt gttgatgttt 3840 tttggtcaga gggtatgaac aggaagaatt aaatgcagcaggctttattt taaatgccga 3900 ttcacattac tctgttcaag ctgcgttgag atgttaaactggcttactat agacttcgta 3960 aaaatggctc cagaaaagta acaaactgaa atctttgagatcacacaggt tggaaatatg 4020 tacataactg cacaaggtgt caattctgct ctacagtgcagttttagtca gttttagttg 4080 cataggtttc cattgtattt atagtctgtt 
tatgctaaatctggccaaag atgaacattg 4140 tccaccacta aaatgcctct gccactttga attctgtgctaattttgtgg ccagaatgcg 4200 gtgatcaaaa cgctccatct ttttacagtg gcataggaagacggcaaaaa tttcctaaag 4260 tgcaatagat tttcaagtgt attgtgcctt gttctaaaacttttattaag taggtgcact 4320 tgacagtatt gaggtcattt gttatggtgc tatttcaattagtctaggtt taggcccttg 4380 tacattttgc ccataacttt ttacaaagta cttcttttattgcacattca gagaatttta 4440 tatatatgtc ttgtgtgcgt gtccttaaac ttccaatcttactttgtctc ttggagattg 4500 ttgaacgcag cttgtctagg aaggggatgg gactagattctaaaatttat ttgggaccat 4560 gggaatgata gttgggaaga aaactatttg cacacgacagatttctagat actttttgct 4620 gctagcttta tgtaatattt attgaacatt ttgacaaatatttatttttg taagcctaaa 4680 agtgattctt tgaaagttta aagaaacttg accaaaagacagtacaaaaa cactggcact 4740 tgaatgttga atgtcaccgt atgcgtgaaa ttatatatttcggggtagtg tgagctttta 4800 atgtttaagt catattaaac tcttaagtca aattaagcagacccggcgtt ggcagtgtag 4860 ccataacttt ctgatgttag taaaaacaaa attggcgacttgaaattaaa ttatgccaag 4920 gttttgatac acttgtctta agatattaat gaaacacttcaaaacactga tgtgaagtgt 4980 ccagattctc agatgtttgt tgtgtggatt ttgtttagttgtgtgttttt ttttttttca 5040 gtgaatgtct ggcacattgc aatcctcaaa catgtggttatctttgttgt attggcataa 5100 tcagtgactt gtacattcag caatagcatt tgagcaagttttatcagcaa gcaatatttt 5160 cagttaataa ggtttcaaaa atcatgtaag gatttaaacttgctgaatgt aaagattgaa 5220 cctcaagtca ctgtagcttt agtaattgct tattgtattagtttagatgc tagcactgca 5280 tgtgctgtgc atattctgat tttattaaaa taaaaaaaaa a5321 6 662 PRT Homo sapiens 6 Met Ser His Val Ala Val Glu Asn Ala LeuGly Leu Asp Gln Gln Phe 1 5 10 15 Ala Gly Leu Asp Leu Asn Ser Ser AspAsn Gln Ser Gly Gly Ser Thr 20 25 30 Ala Ser Lys Gly Arg Tyr Ile Pro ProHis Leu Arg Asn Arg Glu Ala 35 40 45 Thr Arg Gly Phe Tyr Asp Lys Asp SerSer Gly Trp Ser Ser Ser Lys 50 55 60 Asp Lys Asp Ala Tyr Ser Ser Phe GlySer Arg Ser Asp Ser Arg Gly 65 70 75 80 Lys Ser Ser Phe Phe Ser Asp ArgGly Ser Gly Ser Arg Gly Arg Phe 85 90 95 Asp Asp Arg Gly Arg Ser Asp TyrAsp Gly Ile Gly Ser Arg Gly Asp 100 105 110 Arg Ser Gly Phe Gly Lys PheGlu Arg Gly Gly Asn 
Ser Arg Trp Cys
115 120 125
Asp Lys Ser Asp Glu Asp Asp Trp Ser Lys Pro Leu Pro Pro Ser Glu
130 135 140
Arg Leu Glu Gln Glu Leu Phe Ser Gly Gly Asn Thr Gly Ile Asn Phe
145 150 155 160
Glu Lys Tyr Asp Asp Ile Pro Val Glu Ala Thr Gly Asn Asn Cys Pro
165 170 175
Pro His Ile Glu Ser Phe Ser Asp Val Glu Met Gly Glu Ile Ile Met
180 185 190
Gly Asn Ile Glu Leu Thr Arg Tyr Thr Arg Pro Thr Pro Val Gln Lys
195 200 205
His Ala Ile Pro Ile Ile Lys Glu Lys Arg Asp Leu Met Ala Cys Ala
210 215 220
Gln Thr Gly Ser Gly Lys Thr Ala Ala Phe Leu Leu Pro Ile Leu Ser
225 230 235 240
Gln Ile Tyr Ser Asp Gly Pro Gly Glu Ala Leu Arg Ala Met Lys Glu
245 250 255
Asn Gly Arg Tyr Gly Arg Arg Lys Gln Tyr Pro Ile Ser Leu Val Leu
260 265 270
Ala Pro Thr Arg Glu Leu Ala Val Gln Ile Tyr Glu Glu Ala Arg Lys
275 280 285
Phe Ser Tyr Arg Ser Arg Val Arg Pro Cys Val Val Tyr Gly Gly Ala
290 295 300
Asp Ile Gly Gln Gln Ile Arg Asp Leu Glu Arg Gly Cys His Leu Leu
305 310 315 320
Val Ala Thr Pro Gly Arg Leu Val Asp Met Met Glu Arg Gly Lys Ile
325 330 335
Gly Leu Asp Phe Cys Lys Tyr Leu Val Leu Asp Glu Ala Asp Arg Met
340 345 350
Leu Asp Met Gly Phe Glu Pro Gln Ile Arg Arg Ile Val Glu Gln Asp
355 360 365
Thr Met Pro Pro Lys Gly Val Arg His Thr Met Met Phe Ser Ala Thr
370 375 380
Phe Pro Lys Glu Ile Gln Met Leu Ala Arg Asp Phe Leu Asp Glu Tyr
385 390 395 400
Ile Phe Leu Ala Val Gly Arg Val Gly Ser Thr Ser Glu Asn Ile Thr
405 410 415
Gln Lys Val Val Trp Val Glu Glu Ser Asp Lys Arg Ser Phe Leu Leu
420 425 430
Asp Leu Leu Asn Ala Thr Gly Lys Asp Ser Leu Thr Leu Val Phe Val
435 440 445
Glu Thr Lys Lys Gly Ala Asp Ser Leu Glu Asp Phe Leu Tyr His Glu
450 455 460
Gly Tyr Ala Cys Thr Ser Ile His Gly Asp Arg Ser Gln Arg Asp Arg
465 470 475 480
Glu Glu Ala Leu His Gln Phe Arg Ser Gly Lys Ser Pro Ile Leu Val
485 490 495
Ala Thr Ala Val Ala Ala Arg Gly Leu Asp Ile Ser Asn Val Lys His
500 505 510
Val Ile Asn Phe Asp Leu Pro Ser Asp Ile Glu Glu Tyr Val His Arg
515 520 525
Ile Gly Arg Thr Gly Arg Val Gly Asn Leu Gly Leu Ala Thr Ser Phe
530 535 540
Phe
Asn Glu Arg Asn Ile Asn Ile Thr Lys Asp Leu Leu Asp Leu Leu
545 550 555 560
Val Glu Ala Lys Gln Glu Val Pro Ser Trp Leu Glu Asn Met Ala Tyr
565 570 575
Glu His His Tyr Lys Gly Ser Ser Arg Gly Arg Ser Lys Ser Ser Arg
580 585 590
Phe Ser Gly Gly Phe Gly Ala Arg Asp Tyr Arg Gln Ser Ser Gly Ala
595 600 605
Ser Ser Ser Ser Phe Ser Ser Ser Arg Ala Ser Ser Ser Arg Ser Gly
610 615 620
Gly Gly Gly His Gly Ser Ser Arg Gly Phe Gly Gly Gly Gly Tyr Gly
625 630 635 640
Gly Phe Tyr Asn Ser Asp Gly Tyr Gly Gly Asn Tyr Asn Ser Gln Gly
645 650 655
Val Asp Trp Trp Gly Asn
660
7 29 DNA Artificial Sequence oligonucleotide pGBT9(+2) 7 tcgccggaat tgaattcccg gggatccgt 29
8 24 DNA Artificial Sequence oligonucleotide pGBT9 8 tcgccggaat tcccggggat ccgt 24
9 20 DNA Artificial Sequence oligonucleotide 124 9 cgaggtctga ggatgatctt 20
10 20 DNA Artificial Sequence oligonucleotide 125 10 ctgagaaagt ggcgttctct 20
11 26 DNA Artificial Sequence oligonucleotide Top3XhoI 11 aagttactcg agatggccct ccgagg 26
12 27 DNA Artificial Sequence oligonucleotide Top3Hind3 12 acgagcaagc ttctctaccc taccctg 27
13 38 DNA Artificial Sequence oligonucleotide PCS1 13 aattgcgaat tctcgagccc ggggatccgt cgactgca 38
14 30 DNA Artificial Sequence oligonucleotide PCS2 14 gtcgcaggat ccccgggctc gagaattcgc 30
15 18 DNA Artificial Sequence oligonucleotide GALT4 15 ccactacaat ggatgatg 18

What is claimed is:
 1. An isolated nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:3.
 2. A vector comprising the isolated nucleic acid molecule of claim 1.
 3. An expression vector comprising the isolated nucleic acid molecule of claim 1 under the control of a promoter.
 4. A host cell transformed with the expression vector of claim 3.
 5. A method for producing a polypeptide comprising the amino acid sequence of SEQ ID NO:4, comprising the steps of: (a) culturing the host cell of claim 4 under conditions that permit production of the polypeptide; and (b) collecting the polypeptide from the culture, the host cell, or a combination thereof.
 6. An isolated nucleic acid molecule that encodes a polypeptide comprising the amino acid sequence of SEQ ID NO:4.
 7. A vector comprising the isolated nucleic acid molecule of claim 6.
 8. An expression vector comprising the isolated nucleic acid molecule of claim 6 under the control of a promoter.
 9. A host cell transformed with the expression vector of claim 8.
 10. A method for producing an isolated polypeptide comprising the amino acid sequence of SEQ ID NO:4, comprising the steps of: (a) culturing the host cell of claim 9 under conditions that permit production of the polypeptide; and (b) collecting the polypeptide from the culture, the host cell, or a combination thereof.